Skip to content

fix(foundation): properly escape JSON control characters as \u00XX#527

Open
Ravandevil25 wants to merge 1 commit into
DeusData:mainfrom
Ravandevil25:fix/json-escape
Open

fix(foundation): properly escape JSON control characters as \u00XX#527
Ravandevil25 wants to merge 1 commit into
DeusData:mainfrom
Ravandevil25:fix/json-escape

Conversation

@Ravandevil25

Copy link
Copy Markdown

Describe the bug

In src/foundation/str_util.c, the cbm_json_escape function was handling control characters < 0x20 by completely skipping them instead of properly escaping them as \u00XX. The json_escaped_len function mirrored this logic. While this avoided buffer overflows, it silently stripped control characters, resulting in an invalid JSON representation of the actual data.

Impact

Silently stripping control characters from strings mutates the data. If a file path, git branch, or code snippet contained unusual control characters, they were silently erased from the MCP JSON response rather than safely serialized.

Fix

  • Updated cbm_json_escape to safely format control characters as \u00XX using snprintf(buf + pos, 7, "\\u%04x", c).
  • Added #include <stdio.h> to str_util.c for snprintf.
  • Updated json_escaped_len to correctly reserve 6 bytes (len += 6) for unhandled control characters, aligning the buffer sizing with the formatting string.

Local tests and strict compilation (-Wall -Werror) pass successfully.

Signed-off-by: Saurav Kumar <sauravsk2507@gmail.com>
@DeusData

Copy link
Copy Markdown
Owner

Thanks @Ravandevil25 — the control-character \u00XX escaping logic is correct (no double-escaping, and the guard before the 6-byte write is right). Since this is a foundation serializer used in every MCP response, please add a reproduce-first test: a string containing 0x00–0x1f control characters must serialize to valid JSON with \u00XX escapes (red on current main, green with the fix). With that it's ready. 🙏

@DeusData

Copy link
Copy Markdown
Owner

Thanks @Ravandevil25 — the cbm_json_escape fix in str_util.c is real and valuable (that helper is used in 40+ places). Two things before it can land:

  1. The second hunk targets src/git/git_context.c, which doesn't exist on main or on your branch — the diff looks built against a stale tree. Please rebase on current main and drop that hunk.
  2. Please add the reproduce-first test requested earlier: a control byte (e.g. 0x01) must round-trip to rather than vanish.

One note: the pipeline escapers in pass_definitions.c / pass_parallel.c intentionally degrade control bytes to a space — that's a deliberate, different design, so please leave those as-is rather than touch them. 🙏

@DeusData

Copy link
Copy Markdown
Owner

Correction to my earlier note — apologies, that was my mistake. I'd checked src/git/git_context.c against a stale local checkout. It does exist on current main, and it has the same c < 0x20continue control-char-drop bug that your str_util.c fix addresses (see json_escaped_len around line 318). So your git_context.c hunk is valid and welcome — please keep it; disregard my "drop that hunk" point entirely.

The one remaining item before merge is the reproduce-first test: assert that a control byte (e.g. 0x01) round-trips to rather than vanishing. With that added, this is good to go.

Thanks for the fix, and sorry for the confusion. 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants